The completion suggester uses `{ fuzzy: { edit_distance: 2 } }`.
Fuzziness in Elasticsearch refers to edit distance, which can be set to 0, 1 or 2.
`min_similarity` accepts a float value between 0 and 1, but it is now converted to an edit distance based on word length. For example, a word of two characters with an edit distance of 2 would match any other word of length 2.
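To make the two-character example concrete, here is a minimal sketch that computes plain Levenshtein distance and checks every pair of two-letter words over a small alphabet. The function name `edit_distance` and the four-letter alphabet are illustrative assumptions, and the metric the engine actually uses may differ in details (for instance in how transpositions are counted).

```python
from itertools import product


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insert, delete and substitute each cost 1."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


# Illustrative alphabet (an assumption): all 16 two-letter words over "abcd".
words = ["".join(p) for p in product("abcd", repeat=2)]

# The largest distance between any two of them never exceeds 2,
# so an edit distance of 2 matches every other two-letter word.
worst = max(edit_distance(x, y) for x in words for y in words)
```

Two substitutions are always enough to turn one two-letter word into another, which is why a fixed edit distance of 2 is too permissive for very short terms.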
I would suggest renaming `fuzziness` and `min_similarity` to `edit_distance` everywhere. It should accept 0, 1, 2 and `auto`, which sets the edit distance to 1 for words of 1..3 characters, and 2 for words of 4 characters or more.
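The proposed `auto` rule could be sketched roughly as follows. This is only an illustration of the suggestion above, not Elasticsearch code: the function names are hypothetical, and returning 0 for an empty word is an assumption, since the proposal only covers words of one character or more.

```python
def auto_edit_distance(word: str) -> int:
    """Edit distance for `auto`: 1 for words of 1..3 characters, 2 for 4 or more."""
    n = len(word)
    if n == 0:
        return 0  # assumption: the proposal does not cover empty words
    return 1 if n <= 3 else 2


def resolve_edit_distance(value, word: str) -> int:
    """Accept 0, 1, 2 or "auto" and return the edit distance to use for `word`."""
    if value == "auto":
        return auto_edit_distance(word)
    # bool is excluded explicitly because True == 1 in Python
    if isinstance(value, int) and not isinstance(value, bool) and 0 <= value <= 2:
        return value
    raise ValueError(f"edit_distance must be 0, 1, 2 or 'auto', got {value!r}")
```

With this sketch, `resolve_edit_distance("auto", "cat")` gives 1 and `resolve_edit_distance("auto", "cats")` gives 2, while an explicit value such as 2 is passed through unchanged.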
The only fly in the ointment is the `fuzzy` query, which also handles fuzzy numbers and dates; these have nothing to do with edit distance. See the proposed deprecation in #4076.
I opened a pull request for this that is split into two commits. One commit adds the generalization in terms of naming but handles all the old names gracefully. The other commit breaks backwards compatibility and updates the docs etc. I want to pull one of the commits into 0.90 for an easier transition. I also tried to keep most of the defaults in this issue so as not to do N things at once. @clintongormley can you give it a review?
A lot of different APIs currently use different names for the
same logical parameter. Since Lucene moved away from the notion
of a `similarity` and now uses a `fuzziness`, we should generalize
this and encapsulate the generation, parsing and creation of these
settings across all queries.
This commit adds a new `Fuzziness` class that handles the renaming
and generalization in a backwards compatible manner.
This commit also adds a ParseField class to better support deprecated
Query DSL parameters.
The ParseField class allows specifying parameters that have been deprecated.
Those parameters can be more easily tracked and removed in a future version.
This also allows running queries in `strict` mode per index to throw
exceptions if a query is executed with deprecated keys.
Closes elastic#4082
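The idea behind the commit message above, a field that knows its preferred name plus its deprecated aliases and rejects the aliases in `strict` mode, might look something like the sketch below. It is written in Python purely for illustration and is not the actual Elasticsearch `ParseField` class; the method names, the exception type, and the choice of `min_similarity` and `fuzzy_min_sim` as the deprecated aliases of `fuzziness` are all assumptions.

```python
class DeprecatedFieldError(ValueError):
    """Raised in strict mode when a deprecated key is used (hypothetical)."""


class ParseField:
    """A parameter name together with its deprecated aliases (illustrative)."""

    def __init__(self, preferred, *deprecated):
        self.preferred = preferred
        self.deprecated = tuple(deprecated)

    def matches(self, key, strict=False):
        """Return True if `key` names this field; reject old names when strict."""
        if key == self.preferred:
            return True
        if key in self.deprecated:
            if strict:
                raise DeprecatedFieldError(
                    f"deprecated key [{key}], use [{self.preferred}] instead")
            return True
        return False


# Assumed aliases, chosen only for this example.
FUZZINESS = ParseField("fuzziness", "min_similarity", "fuzzy_min_sim")


def read_fuzziness(params, strict=False):
    """Return the fuzziness setting from a dict of query parameters, if any."""
    for key, value in params.items():
        if FUZZINESS.matches(key, strict=strict):
            return value
    return None
```

In lenient mode `read_fuzziness({"min_similarity": 0.5})` still returns 0.5, which is the graceful handling of old names; with `strict=True` the same call raises, which makes leftover deprecated keys easy to find before they are removed.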
Currently we have:
- `flt` | `flt_field` have `min_similarity`
- `fuzzy` has `min_similarity`
- `query_string` has `fuzzy_min_sim`
- `match` has `fuzziness`
- the completion suggester uses `{ fuzzy: { edit_distance: 2 } }`